feat(aiohttp): Add span streaming support#6179
Conversation
Migrate the aiohttp integration to support the new span-first streaming model alongside the existing transaction-based approach. When trace_lifecycle is set to "stream", server and client spans use the StreamedSpan API with OTel-style attributes. The legacy path remains unchanged. Also adds AnnotatedValue.substituted_because_over_size_limit() for body-size-limit annotations and comprehensive tests covering headers, request bodies, transaction styles, error paths, and trace propagation under span streaming. It's being introduced in #6178 but added here in order to be able to start leveraging it right away.
|
bugbot run |
|
@sentry review |
Codecov Results 📊✅ 13 passed | Total: 13 | Pass Rate: 100% | Execution Time: 10.15s All tests are passing successfully. ❌ Patch coverage is 10.64%. Project has 14991 uncovered lines. Files with missing lines (2)
Generated by Codecov Action |
There was a problem hiding this comment.
✅ Bugbot reviewed your changes and found no new issues!
Comment @cursor review or bugbot run to trigger another review on this PR
Reviewed by Cursor Bugbot for commit 2dc9255. Configure here.
| url_query = ( | ||
| {"url.query": request.query_string} | ||
| if request.query_string | ||
| else {} | ||
| ) | ||
|
|
||
| client_address = ( | ||
| { | ||
| "client.address": request.remote, | ||
| "user.ip_address": request.remote, | ||
| } | ||
| if should_send_default_pii() and request.remote | ||
| else {} | ||
| ) |
There was a problem hiding this comment.
This pattern was followed so that if there are no values in the query string or in the client address, we are not adding empty strings to the envelope.
| raise | ||
| except (asyncio.CancelledError, ConnectionResetError): | ||
| if isinstance(span, StreamedSpan): | ||
| span.status = SpanStatus.ERROR.value |
There was a problem hiding this comment.
Note that despite a NoOpStreamedSpan doing nothing when status is set on it, this is fine because ultimately these types of spans aren't sampled or are being ignored.
There was a problem hiding this comment.
Yup this would be my preferred way to work with StreamedSpans and NoOpStreamedSpans -- just using the method and not caring what the actual class is, since the class itself will take care of it.
(At least whenever possible -- which is not always, especially if something else depends on the result of calling the method, or if something external needs to happen only if a span is sampled (like the span needing to be set on the scope).)
| ) | ||
| if body_data is not None: | ||
| span._segment.set_attribute( | ||
| "http.request.body.data", body_data |
There was a problem hiding this comment.
As this is currently written, there is a potential for this to be an empty string instead of a value - specifically in the case of when an AnnotatedValue was returned in get_aiohttp_request_data.
There's a pull request that's open in order to ensure that this gets a value set on it in those situations, which will also have the benefit of making it clearer to users on why a value was replaced (i.e. too large, raw data that was present)
| legacy_span.set_data(SPANDATA.HTTP_QUERY, parsed_url.query) | ||
| legacy_span.set_data(SPANDATA.HTTP_FRAGMENT, parsed_url.fragment) |
There was a problem hiding this comment.
These were not updated from http to url in order to preserve existing behaviour
There was a problem hiding this comment.
The changes here were added in from #6178 so they could be used right away
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 551e1d7. Configure here.
…tsentry/sentry-python into py-2298-migrate-aiohttp-to-span-first
| # There are a number of sub-exceptions that should have an "ok" status, | ||
| # but will actually have a status of error once the `raise` happens below | ||
| # and we pass through the `should_be_treated_as_error` method invoked | ||
| # within a span's `__exit__` method. | ||
| # | ||
| # As a result, made this behaviour explicit | ||
| # so people don't think that the "ok" status persists in that scenario. | ||
| # | ||
| # Although not obvious, this overwriting behaviour occurs in the legacy | ||
| # approach as well, so this matches what we do today. |
There was a problem hiding this comment.
There are a number of sub-exceptions that should have an "ok" status,
but will actually have a status of error once theraisehappens below
and we pass through theshould_be_treated_as_errormethod invoked
within a span's__exit__method.
Would it be a lot of effort to fix this? I imagine we'd need to add some status code parsing to should_be_treated_as_error? Because the current situation def sounds like a bug (so we could also fix it for the legacy code path)
There was a problem hiding this comment.
This is trickier than it seems, at the heart of the issue is how to deal with raised exceptions that are used for certain framework behaviours that aren't necessarily "errors". In this case, these HTTPExceptions are used to create the final HTTP response, whether they're 200s or 500s.
We could call __end__ on the StreamedSpan manually to avoid this prior to the raise, but that would mean risking loss of data/spans if things happen afterwards.
We could also add status code parsing, as you had suggested, within should_be_treated_as_error, but I worry that there could be unintentional side effects.
For instance, could the SDK could be running in a non-web framework application such that a custom status code meant to represent an error is a success HTTP status code (200, etc.) or an exit code that's not an error but is one in HTTP?

Migrate the aiohttp integration to support the new span-first streaming model alongside the existing transaction-based approach.
Fixes PY-2298
Fixes #5996